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Abstract 

This study describes how bkaR, a highly conserved mycobacterial TetR-like 
transcriptional repressor, regulates a number of nearby genes that have associa- 
tions with branched-chain keto-acid metabolism. bkaR (MSMEG_4718) was 
deleted from the nonpathogenic species Mycobacterium smegmatis, and changes 
in global gene expression were assessed using microarray analysis and reporter 
gene studies. bkaR was found to directly control the expression of 10 genes in 
M. smegmatis, and its ortholog in Mycobacterium tuberculosis (Rv2506) is pre- 
dicted to control at least 12 genes. A conserved operator motif was identified, 
and binding of purified recombinant M. tuberculosis BkaR to the motif was 
demonstrated. Analysis of the stoichiometry of binding showed that BkaR 
binds to the motif as a dimer. 
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Introduction 

TetR regulators are abundant in the Mycobacterium 
genus, which includes several pathogenic species, and 
comprise over 30% of the total DNA-binding regulators 
in both Mycobacterium smegmatis and Mycobacterium 
tuberculosis. In M. tuberculosis, the few whose regulons 
have been described include Mce3R, a repressor that con- 
trols an operon involved in lipid metabolism (de la Paz 
Santangelo et at, 2009), EthR, involved in the control of 
the activation of the pro-drug ethionamide (Baulard 
et al., 2000) and KstR and KstR2, involved in the control 
of cholesterol catabolism (Kendall et al, 2007, 2010). 
Although many of the regulons defined to date are con- 
served in the environmental species M. smegmatis, their 
importance in the life style of the pathogenic species 
M. tuberculosis is clear. The Mce3R regulon has been 
implicated in the survival of M. tuberculosis in mice 
(Senaratne et al., 2008) as have many of the genes in the 



KstR and KstR2 regulons (Hu et al, 2009; Nesbitt et al, 
2009; Griffin et al, 2011). 

TetR regulators often bind to palindromic motifs in 
operators using the N-terminal end of the protein to 
repress transcription. In the presence of a ligand that 
binds to the C-terminus, the regulator is removed from 
the operator allowing access to RNA polymerase and 
transcription. Here, we focus on a mycobacterial TetR 
regulator, which we name branched-chain keto-acid regula- 
tor (BkaR). This is highly conserved in both pathogenic 
and nonpathogenic Mycobacteria and is encoded by the 
gene Rv2506 in M. tuberculosis and MSMEG_4718 in 
M. smegmatis. BkaR was previously found to have poten- 
tial associations with pathogenesis in a whole-genome 
transposon screen (Stewart et al, 2005). We show that 
BkaR Msm controls the expression of 10 genes likely to be 
involved in branched-chain keto-acid metabolism. We 
show relevance to M. tuberculosis through protein- 
binding experiments, regulatory motif analysis and reporter 
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gene assays. The potential role of the BkaR regulon is 
discussed. 

Materials and methods 

Bacterial strains and culture conditions 

The strains and plasmids used in this study are described 
in Supporting Information, Table SI and were grown as 
described previously (Kendall et al, 2010). 

Microarray analysis of M. smegmatis AbkaR Msm 

AbkaR Msm was available from a previous study (Balhana 
et al, 2010). RNA from both wild-type and AbkaR Msm 
strains was prepared by direct sampling into guanidine 
thiocyanate (GTC) followed by the use of the RNAeasy 
kit (Qiagen) as previously described (Kendall et at, 
2010). Microarrays were provided by the Pathogen Func- 
tional Genomics Resource Centre at TIGR (http://pfgrc. 
jcvi.org). cDNA from wild-type and mutant strains were 
labelled and competitively hybridized onto the arrays. All 
methods including scanning, data analysis and signifi- 
cance criteria were as previously described (Kendall et al, 
2010). Fully annotated microarray data have been depos- 
ited in BuG@Sbase (accession number E-BUGS-116; 
http://bugs.sgul.ac.uk/E-BUGS-116) and also ArrayExpress 
(accession number E-BUGS-116). 

RT-PCR for analysis of operon structure 

RNA isolated from the AbkaR Msm mutant was treated 
twice with DNase I (Invitrogen) (30 min at 37 °C). The 
reaction was inactivated at 65 °C for 10 min in the pres- 
ence of 1.25 mM EDTA. DNA-free RNA (150 ng) was 
mixed with 300 ng of random primers, 10 mM DTT, 
0.5 mM dNTPs and reversed-transcribed with 200 units 
of Superscript III reverse transcriptase (Invitrogen) 
according to the manufacturer's instructions. Control 
reactions were performed without reverse transcriptase. 
PCR was carried out using Phusion® High-Fidelity PCR 
Master Mix with GC buffer (New England Biolabs). 

Reporter analysis 

This was performed using the lacZ integrative reporter 
construct pEJ414 (Papavinasasundaram et al, 2001). 
Upstream intergenic regions of genes of interest were either 
PCR-amplified or synthesized as oligonucleotides (Table 
S2), cloned using NotI and Xbal and electroporated into 
wild-type and AbkaR Msm strains. Reporter assays were 
carried out as described previously (Papavinasasundaram 
et al, 2001). 



Expression and purification of recombinant 
BkaRMtb 

bkaR Mth was PCR-amplified from M. tuberculosis H37Rv 
genomic DNA (Table S2) and inserted into pNIC28-Bsa4 
(GenBank accession no. EF198106) using ligation-inde- 
pendent cloning with T4 DNA polymerase. The final 
construct, pNbkaR-MTB was transformed into Escherichia 
coli BL21(DE3). Expression and induction were achieved 
in 400 mL of autoinducing medium (Studier, 2005) at 
37 °C overnight. Cultures were harvested by centrifugation 
(30 min, 3070 g, 4 °C), and the pellet was resuspended in 
5 mL of lysis buffer (70 mM HEPES, 20 mM imidazole, 
650 mM NaCl, 0.5 mM P-mercaptoethanol, 10% glycerol, 
pH 8). Cells were lysed on ice using sonication (Soniprep 
150) at 20 urn f° r 5 mm with 30-s rest period every 
1 min. Soluble fractions were isolated through centrifuga- 
tion (11 337 g, 20 min, 4 °C), and His 6 -BkaR Mtb was 
purified using immobilized metal ion affinity chromatog- 
raphy on HisTrap FF Ni-Sepharose columns (GE Health- 
care Life Sciences). The protein was eluted with histidine 
elution buffer (250 mM histidine, 60 mM HEPES, 
150 mM NaCl, 3% glycerol, pH 8). 

Electrophoretic mobility shift assays 

DNA oligonucleotides or PCR amplicons were used to 
assay the binding of His 6 -BkaR Mtb Probes used for com- 
petition assays were end-labelled with DIG-ll-ddUTP 
using the DIG gel shift kit, 2nd generation (Roche). Bind- 
ing reactions were performed by incubating varying con- 
centrations of protein with 0.03 pmol of labelled probe in 
IX binding buffer (20 mM HEPES, 1 mM EDTA, 10 mM 
(NH 4 ) 2 S0 4 , 1 mM DTT, 0.2% Tween-20, 30 mM KC1, 
pH 7.6) together with 0.1 ug of poly-L-lysine and 1 ug of 
poly[d(I-C)] in a total volume of 20 uL. Specific compet- 
itor (nonlabelled probe) was added in 150-fold excess. 
Mixtures also contained an excess 125-fold poly [d(I-C)] 
(nonspecific competitor). Reactions were incubated, sepa- 
rated by polyacrylamide gel electrophoresis and blotted 
onto Hybond-N nylon membranes as previously 
described (Kendall et al, 2010). The membrane was 
washed according to the manufacturer's instructions and 
detection followed the chemiluminescent method using 
anti-DIG-alkaline phosphatase and the substrate CSPD. 
The luminescent membranes were exposed to X-ray film 
for varying time points between 5 min and 1 h. 

Determination of the binding stoichiometry of 
the His 6 BkaR Mtb DNA complex 

A Ferguson plot assay was used to determine the molecu- 
lar weight of the protein-DNA complex (Orchard & May, 
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1993). Electrophoretic mobility shift assays (EMSA) reac- 
tions were run alongside protein standards through a ser- 
ies of gels differing in acrylamide concentration (range of 
6-12%). Gels were run at 10 V cnT 1 in 0.4X TBE buffer, 
and once finished, the distance migrated by the brom- 
ophenol blue was measured for each sample. Each gel was 
cut in half, isolating the tracks containing the protein- 
DNA complex from the tracks with protein standards. 
The gel fragments containing DNA were stained for 
30 min at room temperature in 0.4X TBE with 
0.68 ug mL _1 ethidium bromide, whereas the gel fragments 
containing the protein standards were Coomassie-stained 
(30% methanol, 8% acetic acid, 0.25% w/v Coomassie 
blue -R). The distances migrated by the protein-DNA 
complexes and by each standard were measured and 
divided by the distance migrated by the bromophenol 
blue, giving the relative mobility (_R f ) for each species. 
The logarithms of the relative mobilities were plotted 
against gel concentration, and the retardation coefficients 
(K n slopes of the trend curves) were calculated and plot- 
ted again vs. the molecular weight of each standard. The 
Ferguson plot obtained allowed determination of the 
molecular weight of the complex. 



Bioinformatic analyses 

Genome sequences were compared using Artemis Compa- 
rision Tool (ACT) (Carver et al, 2005), and using clu- 
stalw (Thompson et al, 1994), operator motif discovery 
and analysis were carried out using MEME and MAST 
(Bailey & Elkan, 1994; Bailey & Gribskov, 1998). 



Results 

Genomic analyses of the bkaR region show 
conservation in Mycobacteria and closely 
related species 

Using genomic alignment, bkaR was found to be clearly 
conserved in M. tuberculosis, M. bovis BCG, Mycobacterium 
marinum, Mycobacterium ulcerans, Mycobacterium avium, 
M. smegmatis, Rhodococcus jostii and Norcardia farcinica 
(Fig. 1). The percentage amino acid sequence identities 
between BkaR in M. tuberculosis and its orthologs were 
> 50% in all cases. Most neighbouring genes were also con- 
served, although fadD35, scoA and scoB were only present 
in species more closely related to M. tuberculosis. 
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Fig. 1. Genomic context of bkaR in Mycobacteria and close relatives. The grey shading represents regions of conservation between genomes. 
Arrows represented with a dark grey background correspond to genes that are conserved amongst all species, whereas genes that find no 
ortholog in at least one species are shown with white arrows. The light grey arrow inserted in MUL_3785 represents a transposon insertion. 
Asterisks represent binding motifs identified by the programs MEME and MAST, and the sizes of the intergenic gaps in Mycobacterium smegmatis 
and Mycobacterium tuberculosis are indicated. Genes that were derepressed in the AbkaR Msm mutant are highlighted by thickened lines. 
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BkaR is autoregulatory and binds to a 
conserved palindromic motif within its own 
promoter 

TetR regulators are generally autoregulatory. Measure- 
ment of expression from the bkaR Msm and bkaR Mi t, pro- 
moters in lacZ reporter constructs in wild-type 
M. smegmatis and AbkaR Msm showed that transcription 
was approximately fourfold higher in the strain lacking 
the bkaR regulator, indicating that both bkaR Mtb and 
bkaR Msm are autoregulatory and repress their own 
expression (Fig. 2). 

Transcriptional repressors of the TetR family tend to 
bind to palindromic DNA motifs (Yu et al, 2010). Com- 
putational analysis of the bkaR promoter using MEME 
identified a 24-bp palindromic motif with a highly con- 
served 16-bp core GTTA(N)T(N4)A(N)TAAC that was 
present twice in the promoter regions of all the species 
tested (Fig. 3a). The promoter regions of some of the 
Mycobacteria and Nocardia species were aligned to iden- 
tify the position of the motif (Fig. 3b). Both copies of the 
motif were clearly visible with the 16-bp core being more 
conserved than the rest of the intergenic region. Addi- 
tionally, there was another region that showed more con- 
servation. This could possibly be the —35 site as it is a 
short conserved GC-rich region as seen in other myco- 
bacterial promoters (Gomez & Smith, 2000). 

To test if BkaR binds to the two motifs, 30-bp double- 
stranded probes containing the motifs were used in 
EMSAs. The probes were DIG-labelled, and the binding 



assays were carried out in the presence of 125-fold excess 
of nonspecific competitor poly[d(I-C)] and with specific 
competition with unlabelled probe (Fig. 3c). Clear shifts 
were seen in the presence of the nonspecific competitor 
at all molar ratios. At a 1 : 10 molar ratio of DNA/pro- 
tein, the presence of excess unlabelled probe successfully 
competed with the labelled probe (lanes marked with an 
asterisk). BkaR appears to show lower affinity for probe 2 
(motif near fadD35 in M. tuberculosis and accDl in 
M. smegmatis) as a complete shift was not achieved for 
any of the DNA/protein ratios, whereas with probe 1, a 
complete shift was observed at a 1 : 2 molar ratio. 

BkaR binds to the motif as a dimer 

The stoichiometry of binding of BkaR Mtb with the motif 
DNA was examined using Ferguson plot analysis (Ferguson, 
1964; Orchard & May, 1993). The logarithms of the relative 
mobility (_Rf) of protein standards and BkaR Mt t,-DNA 
complex were plotted against the percentage gel concentra- 
tion (Data SI), and the slopes (retardation coefficient) 
were calculated. The retardation coefficients were subse- 
quently plotted as a function of molecular weight. Using 
the equation of the adjusted curve obtained, we estimated 
the molecular weight of the complex to be 76.9 kDa. 
Subtracting the molecular weight of the 30-bp DNA oligo- 
nucleotide used (18.4 kDa), the mass accounted for the 
protein alone is 58.5 kDa, which approximately corre- 
sponds to the mass of an His 6 -BkaR Mtb dimer (monomer 
is 25.8 kDa). 




Promotorless control bkaR_MTB promoter bkaR_MSM promoter 

Fig. 2. Expression of the bkaR Mt b and bkaR Msm promoter in wild-type 
mc 2 155 and AbkaRusm mutant background. Beta-galactosidase 
activity was calculated in Miller Units (measured as the amount of 
O-nitrophenol produced (nmol) over time (min) per mg of protein). 
Deletion of bkaR Msm causes a statistically significant increase in the 
expression of lacZ from the bkaR M tb and the bkaR Msm promoters 
(unpaired Student's f-test; P < 0.05, *). The values result from means 
of biological triplicates and technical duplicates (this experiment was 
repeated independently with similar results). Error bars represent 
standard deviations. 



Genomewide analysis of the presence of the 
motif in M. smegmatis and M. tuberculosis and 
global gene expression analyses 

Computational analyses for further instances of the motif 
using MAST showed that there were four more instances 
of the motif in the M. smegmatis genome and three more 
instances in M. tuberculosis (Table 1). In M. smegmatis, 
these were found upstream of MSMEG_4920, MSMEG_4524 
and between the divergently oriented MSMEG_3414/ 
MSMEG_3415 (two copies). However, the significances of 
these motifs were much lower than the motif originally 
found within the bkaR promoter. DNA-binding studies 
provided experimental corroboration for the motif 
upstream of MSMEG_4920 only (Data S3). 

In M. tuberculosis, the additional motifs were found 
upstream of Rv2503c (scoA) and between the divergently 
oriented Rv0575c/Rv0576 (two copies). The putative motif 
upstream of scoA (which lies close to bkaR in the 
M. tuberculosis genome, but is not present in M. smegmatis; 
see Fig. 1) had comparable significance to the original 
motifs identified by MEME, and we were able to observe 
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(a) 




r- r~ r- ^ ^ ^ ^ »- T- »- <N CN CM N N 

(b) =, 

M._tuberculosis gcggctgccacGGTTCCTCCGCCTGGTGCCTACCGCATGTCTGACTCGC 4 9 

M._bovis_BCG ctgc cac GGTTCCTCCGCCTGGTGCCTACCGCATGTCTGACTCGC 45 

M._marinum tctgcggcggcggtggttgc cac GGCCCCT-CCGTCTATTTGCATCACGCGG 51 

M._ulcerans cagcggtggttgc cac GGCCCCT-CCGTCTATTTGCATCACGCGG 44 

M._smegmatis gatga caa GGCGTCTCCATCTG 22 

N._farcinica ggtgatcgt catCCCGCCTCGCTTCTCGAATGGAGGTGGAGCGGCCGATCCGGCCGATCC 60 



M._tuberculosis -GTTAACATCGAATAGCTCGTGCT-ACGTTAGTGACGATTAACCGAAGTGTCCAGCATGA 10 7 

M . _bovi s_BCG -GTTAACATCGAATAGCTCGTGCT- ACGTTAGTGACGATTAACCGAAGTGTCCAGCATGA 10 3 

M._marinum AGTTAATATCGACTAACTCGTGTT-ACGTTAGTGACGATTAACCGAAGTGTCCAGGACGA 110 

M._ulcerans AGTTAATATCGACTAACTCGIGTT-ACGTTAGTGACGATTAACCGAAGTGTCCAGGACGA 10 3 

M._smegmatis AGTTAATGACGACTAACTCGTGGT-ACGTTAGTCACGATTAACCGAGTTGTCCA-TACCC 8 0 

N._farcinica AGTTAGTCGCCGATAACTGAGATTCAGGTTAATCTTGATTAACTGAGTTGTCCAGGCCCG 120 

* * * -k -k -k -k -k -k -k * -k -k -k -k * -k -k -k -k -k -k -k -k -k * * * -k -k -k 
< > < > 



M._tuberculosis GTCGTGTACGGAGACCGTCgtgacagcgtc 137 

M._bovis_BCG GTCGTGTACGGAGACCGTC gtg acagc 130 

M._marinum GT — TGGACGGAGGCCGTC atg gcagcgtccgccc 143 

M._ulcerans GT — TGGACGGAGGCCGTC atg gcagcgtccg 133 

M._smegmatis CT-ACCTCCGGAGCCCGTT atg tcgt 105 

N._farcinica G GTCAGGGGC gtg tggga 138 




1:1 1:2 1:6 1:10 1:20 1:10* _ 1:1 1:2 1 :6 1:10 1:20 1:10* 

Probe 1 Probe 2 



Fig. 3. (a) Sequence logo of the putative binding motif constructed from the intergenic regions of bkaR orthologs in Mycobacteria and close 
relatives. Sequence logos illustrate the relative frequency of each nucleotide in a given position. The f-value of the motif is 1.6e~ 20 , this is an 
estimate of the number of motifs (of equal or higher interest) expected to be found by chance if the letters in the input sequences were shuffled, 
(b) ClustalW alignment of the bkaR Mt tJfadD35 (bkaR Msm /accD1 for Mycobacterium smegmatis) intergenic region. Asterisks indicate conservation of 
residues in all genomes. An inverted palindrome is observed in two copies in all species and is represented in bold and with arrows. The location of 
the probes used in the EMSAs is illustrated above the sequence. Lower-case letters represent coding sequences of the two divergently oriented 
genes for each instance, and start codons are underlined for each gene. The area shaded in grey is a putative -35 site, (c) EMSA showing specific 
binding of His 6 -BkaR Mtb to DIG-labelled probes 1 and 2 in the presence of 1 25-fold excess of the nonspecific competitor poly[d(l-C)]. DNA was 
incubated with protein in a variety of molar ratios as labelled at the bottom of each lane. - without protein. An asterisk shows the lanes where 
specific competition with 150-fold unlabelled probe took place (these lanes should be compared with the respective molarity ratio without specific 
competition). 
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Table 1. Occurrences of the motif in Mycobacterium smegmatis and Mycobacterium tuberculosis 



Motif sequence 



F-value 



Flanking genes 



EMSA 



Reporter assay 



Microarray analysis 



M. smegmatis 

CTC G GTTA ATC GTG ACTAACGTAC 
CTG AGTJAATG AC GAC TAAC TC GT 
GC C GGTTAATC ATC GTTAACAC AG 
CCAG GTTA GCCAAAACTATTTCAG 
CTGAAATAGTTTTG G CTAACCTG G 
CTAG GTTA G G C T AC ATTTACTTG C 
M. tuberculosis 

CTAC GTTA GTG AC GAT TAAC C G AA 
TTC G GTTA AC ATC G AATAG CTTAG 
CTCA GTTA ATGATAAT TAAC TGAA 
ATAC GTTA G C G CTC ACTAACGTAT 
ATACGTTAGTGAGCGCTAACGTAT 



1 .4e-05 


bkaR Msm /MSMEG_4717 


+ 


+ 


+ 


3.9E-05 


MSMEG 4717/bkaR.. 


-(- 


ND 


4- 


0.6 


MSMEG_4920 


+ 


ND 


+ 


2 


MSMEG_3414/MSMEG_34 1 5 








3.4 


MSMEG _34 1 4/MSMEG_34 15 








8.2 


MSMEG_4524 








6.1e-06 


bkaR Mtb /fadD35 


+ 


+ 


ND 


1 . 1 e-03 


bkaR Mtb /fadD35 


+ 


ND 


ND 


9.8E-06 


scoA 


+ 


ND 


ND 


2.7 


Rv0575c/Rv0576 






ND 


3.1 


Rv0575c/Rv0576 






ND 



ND, not done; +, complete shift or derepression in the case of reporter and microarray analysis; -, no shift observed or no derepression of gene 
expression; underlined nucleotides represent conserved base pairs of the motif; gene numbers in bold represent the ones that are closer to the 
motif for divergent arrangements. 



Table 2. The regulon of bkaR Msm 

Mycobacterium smegmatis Fold change P-value Gene name Ortholog in Mtb Annotated function 



MSMEG_5576 


3.4 


2.2E-02 






D-mannonate oxidoreductase 


MSMEG_4920 


4.9* 


7.9E-04 


fadA4 


Rv1323 


Acetyl-CoA acetyltransferase 


MSMEG_4718 


-1.8* 


4.8E-01 


bkaR 


Rv2506 


TetR transcriptional regulator (bkaR)* 


MSMEG_4717 


9.0* 


3.7E-05 


accD1 


Rv2502c 


Acetyl/propionyl-coenzyme A carboxylase (p subunit) 


MSMEG_4716 


15.0 


1.5E-05 


accA1 


Rv2501c 


Acetyl/propionyl-coenzyme A carboxylase (a subunit) 


MSMEG_471S 


13.7 


5.8E-06 


fadE19 


Rv2500c 


Acyl-CoA dehydrogenase 


MSMEG_4714 


9.5 


1.1E-04 


Rv2499c 


Rv2499c 


Hydratase 


MSMEG_4713 


16.5 


4.6E-06 


citE 


Rv2498c 


HpcH/Hpal aldolase/citrate lyase family protein 


MSMEG_4712 


8.4 


1 . 1 E-04 


bkdA 


Rv2497c 


Part of branched-chain keto-acid dehydrogenase complex 1 


MSMEG_4711 


3.3 


3.3E-02 


bkdB 


Rv2496c 


Part of branched-chain keto-acid dehydrogenase complex 1 


MSMEG_4710 


1.8 


4.9E-01 


bkdC 


Rv2495c 


Part of branched-chain keto-acid dehydrogenase complex 1 " 


MSMEG_4005 


-3.2 


2.5E-02 






Calcium-binding protein 


MSMEG_2080 


3.1 


3.3E-02 


fadE23 


Rv3140 


Putative acyl-CoA dehydrogenase 


MSMEG _1 885 


3.3 


2.2E-02 


Rv3230c 


Rv3230c 


Iron-sulphur cluster binding domain protein 


MSMEG J 548 


3.0 


4.3E-02 






Dehydratase (propanediol utilization) 


MSMEG J 543 


3.2 


2.3E-02 


Rv0458 


Rv0458 


EPTC-lnducible aldehyde dehydrogenase 


MSMEG _0881 


-3.4 


2.2E-02 






Hypothetical protein 


MSMEG_0066 


3.3 


2.2E-02 


esxA 


Rv3875 


Early secretory antigenic target 



The genes in bold are those directly regulated by bkaR. 
'Occurrence of the motif. 
Venugopal et at. (201 1). 

binding of purified BkaR Mtb to this motif using EMSA 
(Data S3). However, the occurrences of the motifs in the 
intergenic region between Rv0575c and Rv0576 were less 
significant, and no DNA binding was observed (data not 
shown). These results indicate that the motifs upstream 
of MSMEG_4920 and scoA are bound by BkaR but those 
identified upstream of the other genes are either not 
functional binding sites or motifs for an unidentified 
transcriptional regulator. 

Global gene expression changes as a result of deleting 
bkaR were examined by microarray analysis of wild-type 
and AbkaR Msm strains. A total of 14 genes were found to 



be significantly derepressed for a P-value cut-off of 0.05 
(Table 2). Additionally, MSMEG_4710 (bkdC) was also 
derepressed but did not meet the significance criteria. 
Although this gene did not meet the significance cut-off, 
it was 1.8-fold derepressed, and other evidence suggests 
that it is part of the regulon; therefore, it was included in 
the table (see below). 

The highest fold change (> 8) was observed in genes 
that are part of the operon divergently oriented to 
bkaR Msm (accDl-citE) (Fig. 1). With the exception of an 
8-bp gap between fadE19 and MSMEG_4714, the genes 
accDl-citE are contiguous and are likely to be expressed 
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as a single transcript. Therefore, the microarray data sug- 
gest that, as predicted by the presence of the motif, the 
regulator controls the expression of the divergently tran- 
scribed genes between accDl-citE in M. smegmatis. This 
correlates with the observation that TetR regulators often 
control adjacent genes (Ahn ef al, 2012). 

The genes between bkdA and bkdC were also dere- 
pressed in the mutant (although the last gene bkdC was 
not significant at the chosen P- value cut-off). This was 
unexpected because of the presence of a 211-bp gap 
upstream of bkdA with no associated regulatory motif. 
Therefore, RT-PCR was used to assess whether bkdABC 
were cotranscribed with the upstream genes. Reactions 
were performed with primers annealing across runs of 
genes between fadEl 9 and bkdA incorporating the 211-bp 
gap (Data S2). The presence of a band of the expected 
size in the samples that were reverse-transcribed samples 
only (+) supports the microarray data and indicates that 
these genes are indeed cotranscribed. Large gaps 
(> 200 bp) have been reported to separate genes that 
form an operon in other bacterial species (Krause ef al, 
1992). MSMEG_4920 was also significantly up-regulated 
with over fourfold change and has a proximal motif to 
which the regulator bound (Data S3). These combined 
evidence strongly suggest that MSMEG_4920 is also 
directly under the control of the regulator. 

The remaining genes that are either significantly dere- 
pressed (MSMEG_5576, MSMEG_2080, MSMEG_1885, 
MSMEG_1548, MSMEG_1543, MSMEG_0066) or signifi- 
cantly repressed (MSMEG_4005, MSMEG_0881) are not 
associated with a motif. We conclude that these are not 
likely to be under the direct control of BkaR. The changes 
in expression in these genes may be an indirect effect of 
knocking out the regulator. 

Discussion 

The bkaR regulon in M. smegmatis and 
prediction of the regulon in M. tuberculosis 

This study shows that the highly conserved transcriptional 
regulator bkaR binds to a 16-bp palindromic motif and 
to act as a repressor to directly control expression of 
itself, and of the divergently oriented operon (Fig. 1). In 
M. smegmatis, this consists of the genes from accDl-bkdC. 
Additionally, bkaR Msm controls the expression of 
MSMEG_4920, which is de-repressed in the microarray 
analysis and has an associated upstream motif. 

While writing this manuscript, a paper was published 
describing a single motif in the intergenic region of 
bkaR_fadD35 in M. tuberculosis. Binding was demon- 
strated to this region together with affinity measurements, 
and it was shown that both tetracycline and palmityl-coA 



could interfere with binding (Anand ef al, 2012). These 
authors suggested that bkaR (which they call fad35R and 
consider to be a homologue of E. coli FadR) controls the 
expression of fadD35 in M. tuberculosis. However, the 
palindrome described by Anand ef al. was only partially 
identified (the authors describe a diffuse and poorly 
conserved palindrome), only a single motif in the 
bkaR_fadD35 region was described, the motif upstream of 
scoA was not identified, and no gene expression studies 
were carried out to support the work. In contrast, our 
data clearly show the presence of two palindromic motifs 
in the bkaR-fadD35 intergenic region, a motif upstream 
of scoA, and we provide gene expression data to support 
the identification of the regulon in M. smegmatis. 

Gene expression studies in combination with motif 
analysis have previously allowed us to use data from 
M. smegmatis to predict the regulon in M. tuberculosis, 
and these predictions have been subsequently experimen- 
tally verified (Kendall ef al, 2007; Nesbitt ef al, 2009). 
Similarly, in this study, we predict that in addition to the 
orthologs shown in Table 2, fadD35 and the cotranscribed 
genes scoA-scoB will be controlled by BkaR in M. tubercu- 
losis, and we show that BkaR Mtb binds to the motifs 
upstream of fadD35 and scoA. The presence of the two 
motifs in the bkaR_fadD35 intergenic region strongly sug- 
gests that bkaR acts like the paradigm TetR where the 
repressor binds as a dimer to each motif to repress 
expression in both directions (Hillen & Berens, 1994). 

Possible functions of the genes in the bkaR 
regulon 

Many of the genes in the regulon have annotated func- 
tions that could be involved in P-oxidation. However, 
much of the M. tuberculosis genome is dedicated to fatty 
acid P-oxidation, and it is often difficult to work out the 
precise substrates and exactly what role each of the 
enzymes play in the pathogens lifestyle. This is com- 
pounded by re-annotation of some of the genes in the 
regulon. The bkdABC genes were originally annotated as 
encoding for the pyruvate dehydrogenase complex 
(pdhABC), but have been found not to possess such activ- 
ity (Tian ef al, 2005). More recent work suggests that 
they form part of a complex that has branched-chain 
keto-acid dehydrogenase (BCKADH) activity, which is 
the second stage in the catabolism of the branched-chain 
amino acids leucine, valine and isoleucine (Venugopal 
ef al, 2011). 

This latter observation has led us to speculate that the 
other genes in the regulon also catalyse reactions in this 
pathway (Data S4). The branched-chain keto-acid deriva- 
tives are activated by the addition of coenzyme A (the 
product of BCKADH activity) and are then degraded in a 
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series of reactions that involve dehydrogenase (fadE19), 
hydratase (Rv2499c), and, specifically in the case of leucine, 
carboxylase (accAlaccDl) activity. The ultimate endpoint of 
the catabolism of branched-chain amino acids is acetyl-coA 
and propionyl-coA. Acetyl-coA can enter the glyoxylate 
cycle where it is used for energy generation, while propio- 
nyl-coA enters the methyl citrate cycle and the methyl 
malonyl pathway. Intermediates from these cycles can enter 
the TCA cycle; in this way, energy can be derived from the 
breakdown of branched-chain amino acids. Interestingly, 
other recent studies have also found evidence for the role 
of accAlaccDl in branched-chain amino acid catabolism in 
Mycobacteria (Ehebaur et al, 2013). 

The branched-chain keto-acid derivatives (namely 
isobutyryl-CoA and 2-methyl-butyryl-CoA) also act as 
precursors for branched-chain fatty acid synthesis, and so, 
it is possible that the other genes in the bkaR regulon are 
involved in a synthetic pathway. In this alternative 
scenario, the acc genes act as other biotin-dependent 
carboxylases used for the synthesis of multi-methyl- 
branched fatty acids that can be units for mycolic acids, 
as in the case of accD4 and accD5 in M. tuberculosis 
(Gande et al, 2004, 2007; Daniel et al, 2007). 

In conclusion, we have described the regulation of a set 
of genes likely to be involved in branched-chain keto-acid 
metabolism genes in Mycobacteria. The genes are 
repressed by an adjacent autoregulatory TetR regulator, 
bkaR, which binds to a conserved palindromic motif in 
its own promoter region and the promoter regions of the 
genes it controls. 
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